AITopics | adaptive optimizer

Collaborating Authors

adaptive optimizer

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Efficient Adaptive Federated Optimization

Neural Information Processing SystemsJun-23-2026, 03:52:58 GMT

Adaptive optimization is critical in federated learning, where enabling adaptivity on both the server and client sides has proven essential for achieving optimal performance. However, the scalability of such jointly adaptive systems is often hindered by resource limitations in communication and memory. In this paper, we introduce a class of efficient adaptive algorithms, named FedAda2 and its enhanced version FedAda2++, designed specifically for large-scale, cross-device federated environments.

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (0.67)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

Add feedback

8f9d459c19b59b5400ce396e0f8c23e0-Paper-Conference.pdf

Neural Information Processing SystemsFeb-15-2026, 20:41:39 GMT

artificial intelligence, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Country:

Asia > China > Zhejiang Province > Hangzhou (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > Canada > British Columbia > Vancouver (0.04)
(13 more...)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

Birder: Communication-Efficient 1-bit Adaptive Optimizer for Practical Distributed DNN Training

Neural Information Processing SystemsFeb-15-2026, 10:41:34 GMT

Therefore, from a system-level perspective, the design ethos of a system-efficient communication-compression algorithm is that we should guarantee that the compression/decompression of the algorithm is computationally light and takes less time, and it should also be friendly to efficient collective communication primitives.

artificial intelligence, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country: Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language (0.93)

Add feedback

Making Scalable Meta Learning Practical

Neural Information Processing SystemsFeb-11-2026, 21:21:56 GMT

Meta learning aims to learn the inductive biases ( e .

artificial intelligence, deep learning, machine learning, (19 more...)

Neural Information Processing Systems

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Industry: Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Momentum Centering and Asynchronous Update for Adaptive Gradient Methods

Neural Information Processing SystemsDec-25-2025, 04:41:24 GMT

We propose ACProp (Asynchronous-centering-Prop), an adaptive optimizer which combines centering of second momentum and asynchronous update (e.g. for $t$-th update, denominator uses information up to step $t-1$, while numerator uses gradient at $t$-th step). ACProp has both strong theoretical properties and empirical performance. With the example by Reddi et al. (2018), we show that asynchronous optimizers (e.g. AdaShift, ACProp) have weaker convergence condition than synchronous optimizers (e.g. Adam, RMSProp, AdaBelief); within asynchronous optimizers, we show that centering of second momentum further weakens the convergence condition. We demonstrate that ACProp has a convergence rate of $O(\frac{1}{\sqrt{T}})$ for the stochastic non-convex case, which matches the oracle rate and outperforms the $O(\frac{logT}{\sqrt{T}})$ rate of RMSProp and Adam. We validate ACProp in extensive empirical studies: ACProp outperforms both SGD and other adaptive optimizers in image classification with CNN, and outperforms well-tuned adaptive optimizers in the training of various GAN models, reinforcement learning and transformers. To sum up, ACProp has good theoretical properties including weak convergence condition and optimal convergence rate, and strong empirical performance including good generalization like SGD and training stability like Adam.

acprop, momentum centering and asynchronous update, optimizer, (11 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.39)

Add feedback

DP-MicroAdam: Private and Frugal Algorithm for Training and Fine-tuning

Hudişteanu, Mihaela, Kalinin, Nikita P., Cyffers, Edwige

arXiv.org Artificial IntelligenceDec-1-2025

Adaptive optimizers are the de facto standard in non-private training as they often enable faster convergence and improved performance. In contrast, differentially private (DP) training is still predominantly performed with DP-SGD, typically requiring extensive compute and hyperparameter tuning. We propose DP-MicroAdam, a memory-efficient and sparsity-aware adaptive DP optimizer. We prove that DP-MicroAdam converges in stochastic non-convex optimization at the optimal $\mathcal{O}(1/\sqrt{T})$ rate, up to privacy-dependent constants. Empirically, DP-MicroAdam outperforms existing adaptive DP optimizers and achieves competitive or superior accuracy compared to DP-SGD across a range of benchmarks, including CIFAR-10, large-scale ImageNet training, and private fine-tuning of pretrained transformers. These results demonstrate that adaptive optimization can improve both performance and stability under differential privacy.

artificial intelligence, dp-microadam, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2511.20509

Genre: Research Report > New Finding (0.66)

Industry: Information Technology > Security & Privacy (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Security & Privacy (0.93)

Add feedback

A Tale of Two Geometries: Adaptive Optimizers and Non-Euclidean Descent

Xie, Shuo, Wang, Tianhao, Wu, Beining, Li, Zhiyuan

arXiv.org Artificial IntelligenceNov-26-2025

Adaptive optimizers can reduce to normalized steepest descent (NSD) when only adapting to the current gradient, suggesting a close connection between the two algorithmic families. A key distinction between their analyses, however, lies in the geometries, e.g., smoothness notions, they rely on. In the convex setting, adaptive optimizers are governed by a stronger adaptive smoothness condition, while NSD relies on the standard notion of smoothness. We extend the theory of adaptive smoothness to the nonconvex setting and show that it precisely characterizes the convergence of adaptive optimizers. Moreover, we establish that adaptive smoothness enables acceleration of adaptive optimizers with Nesterov momentum in the convex setting, a guarantee unattainable under standard smoothness for certain non-Euclidean geometry. We further develop an analogous comparison for stochastic optimization by introducing adaptive gradient variance, which parallels adaptive smoothness and leads to dimension-free convergence guarantees that cannot be achieved under standard gradient variance for certain non-Euclidean geometry.

artificial intelligence, machine learning, smoothness, (17 more...)

arXiv.org Artificial Intelligence

2511.20584

Country: North America > United States (0.27)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.45)

Add feedback